data <- read.csv("processed_dataset.csv")
results <- analyze(data)
plot(results)
analysed_data <- read_analysed_data("analysed_data.txt")
plot(analysed_data)
Data Analysis and Bioinformatics
2024-02-12
R analysis .r
.rmd
Here's why I needed to preprocess.
```{bash}
sed 's/pattern/replacement/' dataset.csv > processed_dataset.csv
```
Here's how I used the data
```{r}
data <- read.csv("processed_dataset.csv")
results <- analyze(data)
plot(results)
```
Here are results from a Python package
```{python}
data = read_data("input_data.csv")
analysis = analyze_with_python_package(data)
```
Final plots
```{r}
plot(py$analysis)
```
Conclusions...
RMarkdown lets you describe both what you did, by programming in different languages, and why you did it, in the same document.
commit 41645e88a78cc41f43c65a04931fc5ec2b34dacb
Author: James Eapen <james.eapen@vai.org>
Date: Tue Feb 6 11:23:57 2024 -0500
fix rmarkdown guide preface link
diff --git a/session_5_rmarkdown/README.md b/session_5_rmarkdown/README.md
index 0f1ed51..4e5bf4f 100644
--- a/session_5_rmarkdown/README.md
+++ b/session_5_rmarkdown/README.md
@@ -36,7 +36,7 @@
have to remember everything from this - treat it as a reference document
during class and for your homework.
- - [Preface to R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown/installation.html)
+ - [Preface to R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown)
## Additional Resources
commit e7bd2731f703980f9104ecc342c9efc4c40d3364
Author: James Eapen <james.eapen@vai.org>
Date: Mon Feb 5 15:21:56 2024 -0500
add session5_rmarkdown pre-class and slides
diff --git a/session_5_rmarkdown/README.md b/session_5_rmarkdown/README.md
new file mode 100644
index 0000000..0f1ed51
--- /dev/null
+++ b/session_5_rmarkdown/README.md
@@ -0,0 +1,92 @@
+# R Markdown
+
+## Learning Objectives
+
+1. Work with Rmarkdown files in RStudio for generating reports and knit them to
+ PDF, HTML, and Word documents.
+
+1. Include R and bash code chunks to run statistical analyses and generate plots
+
+1. Configure how code chunks and plots are knit to the final output document.
+
+## Pre-class assignment
+
+1. Make sure you have the `rmarkdown`, `knitr`, and `tinytex` packages installed
+ by running the following code in the RStudio console. If the result is not
+ TRUE for each, install the missing one with `install.packages("[package
+ name]")`.
+
+ ```r
+ > c('rmarkdown', 'knitr', 'tinytex') %in% installed.packages()
+ # [1] TRUE TRUE TRUE
+ ```
+
+1. Watch these short videos:
+
+ - [RMarkdown](https://vimeo.com/178485416)
+
+ - [A reproducible workflow](https://www.youtube.com/watch?v=s3JldKoA0zw): a
+ little dramatic, but gets the point across
+
+2. Read the following:
+
+ - [Getting started](https://www.markdownguide.org/getting-started/)
+
+ - [Basic syntax](https://www.markdownguide.org/basic-syntax/): You don't
+ have to remember everything from this - treat it as a reference document
+ during class and for your homework.
+
+ - [Preface to R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown/installation.html)
+
+## Additional Resources
+
+- [Markdown guide written by its creator](https://daringfireball.net/projects/markdown/basics)
+
+- [R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown/): a
+ practical guide to all that you can do in R Markdown written by a statistician
+ who co-authored the [rmarkdown package](https://github.com/rstudio/rmarkdown).
+
+- [Common Problems with rmarkdown (and some
+ solutions)](https://rmd4sci.njtierney.com/common-problems-with-rmarkdown-and-some-solutions.html)
+
+- [Rmarkdown cheatsheet](https://raw.githubusercontent.com/rstudio/cheatsheets/main/rmarkdown-2.0.pdf)
+
+### What can you do with (R)Markdown
+
+#### Reports
+
+- 2021 VAIGS Biostatistics project: <https://jamespeapen.github.io/expdesign/p1final.html>
+
+- <https://svmiller.com/blog/2023/01/what-log-variables-do-for-your-ols-model>
+
+#### PhD theses
+
+These use the [oxforddown](https://github.com/ulyngs/oxforddown) package which
+is based on [bookdown](https://bookdown.org/).
+
+- <https://thesis.shirdekel.com/thesis.pdf>
+
+- <https://ulyngs.github.io/phd-thesis/_main.pdf>
+
+- <https://gsarti.com/thesis/Sarti_2020_Interpreting_NLMs_for_LCA.pdf>
+
+#### Presentations
+
+- <https://meghan.rbind.io/slides/neair/neair.html#/title-slide>
+
+- [RLadies presentations](https://github.com/rladies/rladies_global_presentations)
+
+#### Websites
+
+- <https://rmarkdown.rstudio.com/>
+
+- <https://yihui.org/en>
+
+- <https://svmiller.com>
+
+- <http://www.jenniferbradham.org>
+
+#### CV
+
+- https://svmiller.com/blog/2016/03/svm-r-markdown-cv/
+
Run git log -p [word document]
commit 31d907b06fc8bc89d7148512691c5a56d290b732
Author: Ian Beddows <ianbeddows@c02xg0hvjgh6.vai.org>
Date: Mon Feb 5 11:43:28 2024 -0500
added Jan22_quiz
diff --git a/session_2_Git_and_Github/Jan22_quiz.docx b/session_2_Git_and_Github/Jan22_quiz.docx
new file mode 100644
index 0000000..4f824ca
Binary files /dev/null and b/session_2_Git_and_Github/Jan22_quiz.docx differ
Since Markdown is plaintext, you can see its version history, unlike a Word document, which is not plaintext
Write a story
code (R, bash, python, …)
plots
styling
version controlled
multiple output formats
citations and bibliography
Use markdown guide as a reference
Questions about markdown?
session5_rmarkdown/examples/example.rmd
RMarkdown output is configured using yaml keys and values
---
title: "Document title"
subtitle: "Add subtitle"
date: "Feb 12, 2024"
output:
  html_document:
    toc: true
  pdf_document: # multiple output formats
    toc: true
  word_document:
    reference_docx: reference.docx
---
Refer to Rmarkdown guide for reference.docx setup
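With several formats listed under `output:`, one call can knit all of them. A minimal sketch, assuming `rmarkdown` and pandoc are installed (plus a LaTeX installation such as TinyTeX for the PDF); `render_all` is a hypothetical helper name, not part of the package:

```r
# Hypothetical helper (not part of rmarkdown): knit one file to every
# format listed under `output:` in its YAML header.
render_all <- function(path) {
  rmarkdown::render(path, output_format = "all")
}

# Usage, assuming example.rmd is in the working directory:
# render_all("example.rmd")
```

The Knit button in RStudio renders one format at a time; `output_format = "all"` is the scripted equivalent for rendering every listed format.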
Error in eval(expr, envir, enclos): object 'does_not_exist' not found
processing file: rmarkdown.qmd
|.... | 8% (unnamed-chunk-2)
|...... | 12% (unnamed-chunk-3)
|..................... | 41% (unnamed-chunk-4)
Quitting from lines 271-272 (rmarkdown.qmd)
Error in cat(does_not_exist) : object 'does_not_exist' not found
Without a label it can be hard to figure out where the chunk with the error is
Error in eval(expr, envir, enclos): object 'does_not_exist' not found
processing file: rmarkdown.qmd
|.... | 8% (unnamed-chunk-2)
|...... | 12% (unnamed-chunk-3)
|..................... | 41% (this_named_chunk)
Quitting from lines 271-272 (rmarkdown.qmd)
Error in cat(does_not_exist) : object 'does_not_exist' not found
The label identifies the chunk with the error: this_named_chunk
echo: show code?
What’s written:
```{r, echo=FALSE}
message("This is a test message")
warning("This is a test warning")
colnames(cars)
```
Output:
This is a test message
Warning: This is a test warning
[1] "speed" "dist"
eval: run code?
What’s written:
```{r, eval=FALSE}
message("This is a test message")
warning("This is a test warning")
colnames(cars)
```
Output:
include: do anything with code and output?
What’s written:
Output:
message: show messages?
What’s written:
```{r, message=FALSE}
message("This is a test message")
warning("This is a test warning")
colnames(cars)
```
Output:
Warning: This is a test warning
[1] "speed" "dist"
warning: show warnings?
What’s written:
```{r, warning=FALSE}
message("This is a test message")
warning("This is a test warning")
colnames(cars)
```
Output:
This is a test message
[1] "speed" "dist"
Be careful when ignoring warnings - only for final draft
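The options above are set per chunk. They can also be given document-wide defaults, usually in a first "setup" chunk that itself uses `include=FALSE`. A minimal sketch, assuming `knitr` is installed; the particular defaults chosen here are only an illustration:

```r
# Assumes the knitr package is installed (it is in the pre-class checklist).
library(knitr)

# Set document-wide defaults; an individual chunk can still override
# them in its own header, e.g. {r, echo=TRUE}.
opts_chunk$set(
  echo = FALSE,     # hide code
  message = FALSE,  # hide messages
  warning = FALSE   # hide warnings (best left for the final draft)
)

# Check the current default for one option
opts_chunk$get("echo")
```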
Number of rows in cars dataset =
Number of rows in `cars` dataset = `r nrow(cars)`
Number of rows in cars dataset = 50
_ genes from _ patients
`r dim(rna_count_matrix)[1]` genes from `r dim(rna_count_matrix)[2]` patients
3000 genes from 24 patients
One plus One = `r 1 + 1`
One plus One = 2
The cars have a mean speed of `r mean(cars$speed)` mph.
The cars have a mean speed of 15.4 mph.
date: `r Sys.Date()`
2024-02-26
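Inline values often need rounding before they read well in a sentence. One way to handle this, sketched with the built-in `cars` dataset, is to compute and format the value in a chunk and then refer to the stored variable inline; the variable names are illustrative:

```r
# `cars` ships with base R, so no extra packages are needed.
n_rows <- nrow(cars)
mean_speed <- round(mean(cars$speed), 1)  # round before reporting

# Build the same sentence that inline code would produce
sentence <- sprintf(
  "The %d cars have a mean speed of %.1f mph.",
  n_rows, mean_speed
)
print(sentence)
```

In the Rmd text itself, the stored value would then be written as `r mean_speed` instead of repeating the calculation inline.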
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.4 LTS
Matrix products: default
BLAS/LAPACK: /nix/store/yph2asi2jsbab21sqckalj6kgvd94jgf-blas-3/lib/libblas.so.3; LAPACK version 3.12.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/Detroit
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.3.2 fastmap_1.1.1 cli_3.6.2 tools_4.3.2
[5] htmltools_0.5.7 yaml_2.3.8 rmarkdown_2.25 knitr_1.45
[9] jsonlite_1.8.8 xfun_0.41 digest_0.6.33 rlang_1.1.2
[13] evaluate_0.23
Use example.rmd to make a report using palmerpenguins
Plot the body weight against sex and add a caption
Test whether there is a difference in mean body weight between male and female penguins
Write a short conclusion of the analysis
Using inline-code:
Add sessionInfo() at the end
Knit to HTML, PDF and Word
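A possible starting point for the plot and the test in this exercise, assuming the `palmerpenguins` package is installed. A box plot and a Welch t-test are one reasonable choice here, not the only one:

```r
# Assumes the palmerpenguins package is installed.
library(palmerpenguins)

# Keep only rows with both sex and body mass recorded
peng <- subset(penguins, !is.na(sex) & !is.na(body_mass_g))

# Body mass by sex; in an Rmd chunk the caption would go in the
# chunk header via the fig.cap option
boxplot(body_mass_g ~ sex, data = peng,
        xlab = "Sex", ylab = "Body mass (g)")

# Welch two-sample t-test for a difference in mean body mass
test_result <- t.test(body_mass_g ~ sex, data = peng)
test_result$p.value
```

The p-value stored in `test_result` can then be reported in the conclusion with inline code.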
install.packages('palmerpenguins')
Homework: run another analysis between two variables in the dataset and report
Upload your final Rmd to GitHub, with your name in the filename, under session5_rmarkdown/homework